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CRYPTOGRAPHIC DATA PROCESSING SYSTEMS, COMPUTER PROGRAM 
PRODUCTS, AND METHODS OF OPERATING SAME IN WHICH A SYSTEM 
MEMORY IS USED TO TRANSFER INFORMATION BETWEEN AHOST 
PROCESSOR AND AN ADJUNCT PROCESSOR 

CROSS-REFERENCE TO PROVISIONAL APPLICATIONS 
This application claims the benefit of Provisional Application Serial No. 
60/203.409, filed May 1 1 , 2000, entitled Cryptographic Acceleration Methods and 
Apparatus, and Provisional Application Serial No. 60/203.465, filed May 1 1, 2000, 
entifled Methods and Apparatus for Supplying Random Numbers, the disclosiires of 
5 which are hereby incorporated herein by reference in their entirety as if set forth fiiUy 
herein. 

BACKGROUND OF THE INVENTION 
The present invention relates generally to the field of data processing systems, 
10 and, more particularly, to cryptographic data processing systems, computer program 
products, and methods of operating same. 

Signal processors and integrated circuit chips have been developed to 
accelerate cryptographic operations, such as pubUc key operations. Examples of such 
chips include, but are not limited to, the Hifii 6500 avaflable from Hifii, Inc., the 
15 SafeNet ADSP 2141 available fix)m SafeNet, Inc., and the Rainbow Mykolronx 
FastMAP available fix)m Rainbow Mykotronx, Inc. Despite the availability of 
cryptographic accelerator products, there remains room for improvement in the art. 

For example, conventional cryptographic data processing systems generaUy 
use two-main-mfethT)ds-for issuing a-command to a cryptographic accelerator. The first 
20 method involves the provision of a command register on the cryptographic accelerator 
that a host processor uses to issue a single command. Once the cryptographic 
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accelerator completes executing a command, the host processor may issue a new 
command. After completing a command, the cryptographic accelerator is generally 
idle vmtU the host processor issues a new command. Unfortunately, the host processor 
may spend much time interacting directly with the cryptographic accelerator to 
5 download data and issue commands. This may reduce the amount of time available to 
the host processor for attending to other tasks. 

The second method allows the host processor to download one or more 
command sequences to the cryptogr^hic accelerator and then to instruct the 
cryptographic accelerator to execute one or more of the downloaded command 
10 sequences. After completing a command sequence, the cryptographic accelerator is 
generally idle until the host processor issues a new command. The size of the 
command sequences may be limited based on the amount of memory that may be 
placed on the cryptographic accelerator. Like the first method, the host processor 
may spend much time interacting directly with the cryptographic accelerator to 
15 download data and issue command sequences. This may reduce the amount of time 
available to the host processor for attending to other tasks. 

Cryptographic accelerators generally perform operations using one or more 
operands. These devices may include general-purpose operand storage that comprises 
fixed length registers to store the operands and results. To execute an instruction, a 
20 register number is used to indicate which operand should be used for the operation 
and where the output should be stored. For example, if the operation were "a + b = 
c," then part of the instruction would indicate that "a" is in register 7, "b" is in register 
1, and "c" should be put into register 2. 

Because the registers are fixed in size and the operands and results are variable 
25 in size, the size of the operands wiU always be less than or equal to the register size. 
As a result, some of the memory in the registers may be wasted. This reduces the 
number of operands that may be stored on a chip in a given amount of space. In 
addition, if the cryptographic accelerator is redesigned to acconunodate larger 
operands, then each of the registers may need to be modified. More registers may be 
30 designed into a cryptographic accelerator; however, adding more memory to a 

cryptographic accelerator may reduce the amount of othar functionality that may be 
included and/or- increase the cost- • 

Cryptographic processors and/or otha- types of signal processors and 
integrated circuits may use a hardware-based random number generator. Various 
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conventional metbods may be used to retrieve random numbers from an integrated 
circuit incorporating a random number generator. One method is for the random 
number generator to provide one or more data registers that a host processor may read 
to obtain random numbers. The host processor may tell the random number generator 
5 to Fovide more random data before or after retrieving random data from the registers. 
The raiidom number generator may generate the random data m the background so 
that random data may be avaUable when needed by the host processor. 

Another method for obtaining random data is for the host processor to request 
a sample of random data from ihe random number generator. The host processor may 
10 provide the random number generator with a request that specifies an amount of 

random data and a location in memory where the random data should be placed. The 
random number generator may then generate the random data and transfer the random 
data to the requested location in the background. 

Unfortunately, by providing random data through data registers on the random 
15 number generator or other integrated circuit chip, any buffer management that may be 
desiied is generally performed by the host processor. Moreover, the bus that comiects 
the host processor with the random number generator may be used inefficiently 
because single data reads are typicaUy used instead of block reads. If a host processor 
requests a block of random data, however, then the host processor may initiate the 
20 data transfers and any desired buffer management that may be desired is generaUy 
performed by the host processor. The foregoing operations may be performed in the 
background and/or a fast host processor may be used; however, a faster host processor 
may increase system costs. 

25 SUMMARY OF THE INVENTION 

Embodiments of the present invention provide cryptographic data processing 
systems, computer program products, and methods of operating same in which system 
memory is used to transfer information between a host processor and an adjunct 
processor. For example, in accordance with embodiments of tiie present invention, 

30 cryptographic data processing systems comprise a host processor, a system memory 
coupled to the host processor, and a cryptographic processor integrated circuit that 
comprises a local memory.- One or more operands are downloaded into the local 
memory from the system memory, using, for example, a load command, and one or 
more operations are performed on at least one of the downloaded operands to generate 
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a result in the local memory. Hie result that is generated in the local memory is then 
stored in the system memory, using, for example, a store command. Advantageously, 
interaction between the host processor and the cryptographic processor may be 
reduced as the host processor need not consume processing time downloading 
5 operands to the cryptographic accelerator processor and/or uploading results ftom the 
cryptographic processor to system memory. 

In accordance with further embodiments of the present mvention, interaction 
between the host processor and the cryptographic processor may be further reduced 
and overall system performance improved by providing a command queue mthe 
10 system memory, loading a command block into the command queue usmg Ihe host 
processor, executing the command block using the cryptographic processor, and 
notifying the host processor that the command block has been executed. By loading 
commands into a command queue in the system memory where the cryptographic 
processor may retrieve them for processing, the host processor need not spend time 
1 5 interacting directly with the cryptographic processor. 

In accordance with further embodiments of the present invention, the host 
processor may be notified that the command block has been executed in various ways. 
For example, the cryptographic processor may invoke an interrupt to notify the host 
processor that the command block has been executed. In particular embodiments, the 
20 interrupt may be mvoked if the host processor has requested notification via an 
mterrupt in an interrupt field of the command block, hi other embodunents, the 
cryptographic processor may update a completion field in the command block to 
notify the host processor that the command block has been executed, hi still other 
embodunents, a periodic mtemipt may be defined such that when the interrupt occurs 
25 the host processor reads the completion fields of any command blocks. 

In accordance with still fiuther embodunents of the present mvention, 
improved utilization of the system memory may be attamed by re-using at least a 
portion of a command block that contains mput data to store a result or output that is 
generated by an adjunct processor, such as the cryptographic processor. For example, 
30 a command queue may be provided m the system memory and a command block may 
be loaded into the command queue usmg the host processor. The command block 
comprises an mput data field that contams mput data. The adjunct processor performs, 
an operation based on tiie mput data to generate a result and this result is stored in the 
mput data field such that at least a portion of the input data is overwritten. 
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Advantageously, the memory reserved for the command block in the system memory 
may be reduced because additional storage space need not be reserved to store the 
result of executing the command block either in the command block or elsewhere in 
the system memory. 

5 Cryptographic processors and/or other types of signal processors and 

integrated circuits may use a hardware-based random number generator. In 
accordance with further embodiments of the present invention, random number 
samples may be provided for use by the host processor while reducing interaction 
between the host processor and the cryptographic processor. For example, a random 

10 number data queue in the system memory may be provided that has a read address 
and a write address associated therewith. The cryptographic processor loads a 
random number sample mto the random number data queue at the write address and 
the host processor reads the random number sample begmning at the read address. 
The host processor need only interact with the cryptographic processor to update the 

15 read address and to check the value of the write address when the read address 

approaches the last value the host processor has for the write address. In addition, the 
cryptographic accelerator processor may manage the buffering of the random number 
samples, which may conserve processor cycles of the host processor. 

20 BRIEF DESCRIPTION OF THE DRAWINGS 

Other features of the present invention will be more readUy understood from 
the foUowing detailed description of specific embodiments thereof when read in 
conjunction with the acconqjanjing drawings, in which: 

EIG. 1 is a block diagram that iUustrates cryptographic data processing 
25 systems, computer program products, and methods of operating same in accordance 
with embodiments of the present invention; 

FIG. 2 is a flowchart that illustrates operations of cryptogr^hic data 
processing systems and computer program products m accordance with embodiments 

of the present invention; 
30 FIGS. 3 - 5 are block diagrams that illustrate functional execution units of a 

cryptographic accelerator processor in accordance with embodiments of the present 
invention; 
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FIG. 6 is a flowchart that iUustrates operations of cryptographic data 
processing systems and computer program products in accordance with further 
embodiments of the present invention; 

FIG. 7 - 8 are block diagrams that iUustrate an encryption/authentication 
5 command queue and apublic key command queue, respectively, in accordance with 
embodiments of the present invention; 

FIGS. 9 - 11 are flowcharts that iUustrate operations of cryptogr^hic data 
processing systems and computer program product^ in accordance with further 
embodiments of the present invention; 
10 FIGS. 12A - 12D are block diagrams that illustrate command blocks in 

accordance with embodiments of the present invention; 

FIG. 13 is a flowchart that illustrates operations of cryptographic data 
processing systems and computer program products in accordance with further 
embodiments of the present invention; 
1 5 FIGS. 14A, 14B, and 15 are block diagrams that illustrate command blocks in 

accordance with further embodiments of the present invention; 

FIGS. 16 and 17 are flowcharts that illustrate operations of cryptographic data 
processing systems and computer program products in accordance with fuyther 
embodiments of the present invention; 
20 FIG. 18 is a block diagram that iUustrates a random number generator data 

queue in accordance with embodiments of the present invention; 

FIG. 19 is a flowchart that iUustiates operations of cryptographic data 
processing systems and computer program products in accordance with further 
embodiments of the present invention; 
25 FIG. 20 is a block diagram that iUustrates a command interfece for a 

conventional appUcation specific integrated ckcuit; 

FIG. 21 is a block diagram tiiat Ulustrates paraUel command interfaces for an 
appUcation specific integrated circuit in accordance witii embodiments of the present 
invention; 

FIG. 22 is a block diagram of a cryptographic accelerator processor in which 
command interfece managers are respectively associated witii functional execution 
units in accordance witii embodiments of the present mvention; and 
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BIG. 23 is a flowchart that aiustrates operations of cryptographic data 
processing systems and computer program products in accordance with further 
embodiments of the presMit mvention. 

5 DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 

WhUe the invention is susceptible to various modifications and alternative 
forms, specific embodiments thereof are shown by way of example in the drawmgs 
and mil herein be described in detail. It should be understood, however, that there is 
no intent to limit the mvention to the particular forms disclosed, but on the contrary, 

10 the invention is to cover all modifications, equivalents, and alternatives felling within 
tlie spirit and scope of the invention as defined by the claims. Like reference numbers 
signify like elements throughout the description of the figures. It will also be 
understood that when an element is referred to as being "comiected" or "coupled" to 
another element, it.can be directly connected or coupled to the other element or 

15 intervening elements may also be present. In contrast, when an element is referred to 
as being "directiy connected" or "directiy coupled" to another element, there are no 

intervening elements present. 

The present mvention may be embodied as metiiods, data processing systems, 
and/or computer program products. Accordingly, the present invention may be 

20 embodied in hardware and/or in software (including firmware, resident software, 
micro-code, etc.). Furthermore, the present invention may take the form of a 
computer program product on a computer-usable or computer-readable storage 
medium having computer-usable or computer-readable program code embodied in tiie 
medium for use by or in connection with an instruction execution system. In tiie 

25 context of this document, a computer-usable or computer-readable medimn may be 
any medium that can contain, store, communicate, propagate, or transport the program 
for use by or in connection with the insbwction execution system, q)paratus, or 
device. 

TTie computer-usable or computer-readable medium may be, for example but 
30 not limited to, an electronic, magnetic, optical, electtomagnetic, infirared, or 

semiconductor system, apparatus, device, or propagation medium. More specific 
examples (a nonexhaustive list) of the computer-readable medium would include the 
following: an electrical connection having one or more wires, a portable computer 
diskette, a random access memory (RAM), a read-only memory (ROM), an erasable 
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prograiDinable read-only memory (EPROM or Flash memory), an optical fiber, and a 
portable compact disc read-only memory (CD-ROM). Note that the computer-usable 
or computer-readable medium could even be paper or another suitable medium upon 
which the program is printed, as the program can be electronically captured, via, for 
5 instance, optical scanning of the paper or other medium, then compiled, interpreted, or 
otherwise processed in a suitable manner, if necessary, and then stored in a computer 
memory. 

Referring now to FIG. 1, an exemplary cryptographic data processing system 
12, m accordance with embodiments of the present invention, comprises a 

10 cryptographic accelerator processor 14, a host processor 16, a cache memory 18, a 
system memory 22, and a system bus controller 24, such as a north-bridge system 
controller. The system bus controller 24 couples the host processor 16 to the cache 
memory 18 and the system memory 22, and also couples the host processor 16 and the 
system memory 22 to the cryptographic accelerator processor 14 via a system bus 26. 

15 which may be, for example, a peripheral component interconnect (PCI) bus. The host 
processor 16 may be, for example, a commercially available or custom 
microprocessor. The system memory 22 is representative of an overall hierarchy of 
memory devices containing the software and data used to implement the fimctionaUty 
of the cryptographic data processing system 12. The system memory 22 may include, 

20 but is not limited to, the following types of devices: ROM. PROM, EPROM, 
EEPROM, flash, SRAM, and DRAM. 

In accordance with embodiments of the present invention, the cryptographic 
accelerator processor 14 comprises a random number generator (RNG) execution unit 
28, an encryption/authentication (E/A) execution unit 32, and a public key (PK) 

25 engine execution unit 34, which are coupled to a local memory 36 via a local bus 38. 
In accordance witfi particular embodiments of the present invention, the system 
memory 22 contains a random number (RN) data queue 42, an E/A command queue 
44, a PK command queue 46, and data buffer(s) 47. 

Al&ough FIG. 1 illustrates an exemplary cryptographic data processing 

30 system architecture, it wiU be understood that the present invention is not limited to 
such a configuration, but is intended to encompass any configuration capable of 
carrying out operations described herein^ Computer program code for carrying out 
operations of embodiments of the cryptographic data processing system 12 may. be 
written in a high-level programming language, such as C or C++, for development 
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convenience. Nevertheless, some modules or routines may be written in assembly 
language or even micro-code to enhance perfomiance and/or memory usage. It will 
be further appreciated that the functionality of any or all of the program modules may 
also be implemented using discrete hardware components, a single application 
5 specific inte^ted circuit (ASIC), or a programmed digital signal processor or 
microcontroller. 

The present invention is described hereinafter with reference to flowchart 
and/or block diagram iUustrations of methods, data processing systems, and/or 

computer program products in accordance with exemplary embodiments of the 
10 invention. It will be understood that each block of the flowchart and/or block diagram 
illustrations, and combmations of blocks in the flowchart and/or block diagram 
illustmtions. may be implemented by computer program instructions. These computer 
program instructions may be provided to aprocessor of a general puipose computer, a 
special purpose computer, or other programmable data processing apparatus to 
15 produce a machine, such that the instructions, which execute via the processor of the 
computer or other programmable data processing apparatus, create means for 
implementing the fimctions/acts specified in the flowchart and/or block diagram block 
or blocks. 

These computer program instructions may also be stored in a computer usable 
20 or computer-readable memory that may direct a computer or other programmable data 
processing apparatus to function in a particular manner, such that the instructions 
stored in the computer usable or computer-readable memory produce an article of 
manufacture including instructions that implement the function/act specified in the 
flowchart and/or block diagram block or blocks. 
25 The computer program instructions may also be loaded onto a computer or 

other programmable data processing apparatus to cause a series of operational steps to 
be performed on the computer or other programmablB apparatus to produce a 
computer implemented process such that the instructions that execute on the computer 
or other programmable apparatus provide steps for implementing the fimctions/acts 
30 specified in the flowchart and/or block diagram block or blocks. 

Exemplary operations of cryptographic data processing systems, computer 
program products, and methods of operating same, in accordance with embodiments 
of the present invention, will be described hereafter. Referring now to FIG. 2, the 
host processor 16 loads a command block into one of the command queues 44 and 46 

9 
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at block 52. Tte cryptographic accelerator processor 14 may be notified by the host 
processor 16 that the command block is avaUable for processing or may periodically 
access the command queues 44, and/or 46 to determine if a command block is 
available for processing. The cryptographic accelerator processor 14 downloads the 
5 command block from one of the command queues 44 and 46 and executes the 
command block at block 54. Once the cryptographic accelerator processor 14 
completes execution of the command block, the host processor 16 is notified at block • 
56. Thus, according to embodiments of the present invention, the host processor 16 
need not spend time interacting directly with the cryptographic accelerator processor 
10 14 (e.g., issuing a conunand to the cryptographic accelerator processor 14. waiting for 
that command to complete, and then issuing another command). Instead, the host 
processor 16 may load commands into command queues 44 and 46, which may then 
be processed in background by the cryptographic accelerator processor 14. Moreover, 
the size and number of command block sequences may be less constrained because 
1 5 the availability of system memory is generally more abundant. 

Referring now to FIGS. 3 - 5, the RNG execution unit 28, the E/A execution 
unit 32, and the PK engine execution unit 34 may use various registers that facilitate 
communication with the RN data queue 42 and the command queues 44 and 46. For 
example, as shown in BIG. 3, a controystatus register 62, a RN data queue base 
20 address register 64, a RN data queue size register 66, and a RN data queue pointer 
register 68 may be defined for use by the RNG execution unit 28. The controVstatus 
register 62 may include a self-test error field, which may be set if the RNG execution 
unit 28 generates two successive random number samples that are the same, and/or an 
error flag field, which may be used to notify the host processor 16 of an error on the 
25 system bus 26. The RN data queue base address register 64 may be used to hold the 
base address of the RN data queue 42 in the system memory 22. If the RN data queue 
42 does not have a fixed size, then the FN data queue size register 66 may be used to 
hold the size of the RN data queue 42. The RN data queue pointer register 68 may 
comprise a read pointer 72 portion and a write pointer 74 portion, which may be used 
30 by the RNG execution umt 28 and the host processor 16 as will be discussed in more 
detail hereinafter. 

As shown in FIG. 4, a controVstatus register 82, an E/A command queue base 
address register 84, an E/A command queue size register 86, and an E/A command 
queue pointer register 88 may be defmed for use by the E/A execution unit 32. The 

10 
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controystatus register 82 may include an interrupt flag field, which may be set if the 
host processor 16 requests an interrupt upon completion of a command block and/or if 
execution of a command block foils and/or an error flag field, which may be used to 
notify the host processor 16 of an error on the system bus 26. The E/A command 
5 queue base address register 84 may be used to hold the base address of the E/A 
command queue 44 in the system memory 22. If the E/A command queue 44 does 
not have a fixed size, then the E/A command queue size register 86 may be used to 
hold the size of the E/A command queue 44. The E/A command queue pointer 
register 88 may comprise a read pointer 92 portion and a write pointer 94 portion, 
10 w*ich may be used by the E/A execution unit 32 and the host processor 16, 
respectively, as will be discussed in more detail hereinafter. 

As shown in FIG. 5, a control/status register 102, a PK command queue base 
address register 104, a PK command queue size register 106, and a PK command 
queue pointer register 108 may be defined for use by the PK engine execution unit 34. 
15 The control/status register 102 may include an mterrupt flag field, vMch may be set if 
the host processor 16 requests an interrupt upon completion of a command block 
and/or if execution of a command block fails and/or an error flag field, which may be 
used to notify the host processor 16 of an error on the system bus 26. The PK 
command queue base address register 104 may be used to hold the base address of the 
20 PK command queue 46 in the system memory 22. If the PK command queue 46 does 
not have a fixed size, then the PK command queue size register 106 may be used to 
hold the size of the PK command queue 46. The PK command queue pomter register 
108 may comprise a read pointer 112 portion and a write pointer 114 portion, which 
may be used by the PK engine execution unit 34 and the host processor 16, 
25 respectively, as will be discussed in more detail hereinafter. 

Referring now to FIG. 6, operations for loadmg a command block into the 
E/A command queue 44 and/or the PK command queue 46, in accordance with 
embodiments of the present mvention, wiU be described in more detail hereafter. In 
general, the host processor 16 writes commands into the command queues 44 and 46 
30 beginning at write address locations stored in the write pointers for the respective 
command queues (e.g., write pointers 94 and 114). Before writing a command block 
into-a command queue. howeverj -Ae host processor determmes at block 122 whether 
tiie write address plus the command block size equals the read address stored in the 
conesponding read pointer 92 or 112. If the resuU determined at block 122 is "Yes." 

11 
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then the host processor 16 postpones loading a new command block into the 
command queue untU the cryptographic accelerator processor 14 has incremented the 
read address. If, however, the result determined at block 122 is "No." then the host 
processor 16 loads a command block mto the command queue at block 124 at the 
5 write address associated with the conmiand queue and then increments the write 
address at block 126 by an amount corresponding to the size of the loaded command 
block. The host processor 16 need not check the current read address every time a 
new command block is loaded. Instead, the host processor 16 may check the read 
address when the write address is getting close to the last value the host processor 16 
10 has for the read address. Checking the read address may be expensive in terms of 
processor cycles consumed. By checking the read address only when the read address 
is getting close to tlie write address (e.g., within a predefined threshold), host 
processor 16 cycles may be conserved. 

The foregoing operations are illustrated, for example, in FIGS. 7 and 8, which 
15 show embodiments of the E/A command queue 44 and the PK command queue 46, 
respectively. As shown in FIGS. 7 and 8, both the E/A command queue 44 and the 
PK command queue 46 are configured to hold m command blocks, which each 
comprise eight, thirty-two bit words. The host processor 16 has written a single 
command block into the first command block position (/. e. , the "0" position) and the 
20 write address has been incremented to point to the next empty command block slot. 
The addresses used in FIGS. 7 and 8 are based on command block slot numbers for 
purposes of illustration. These addresses may be converted into absolute addresses by 
multiplying the command block slot number by 256 and adding the resulting product 
to the respective base addresses for the command queues, which are stored in the E/A 
25 command queue base address register 84 and the PK command queue base address 
register 104, respectively. Note that the test used at block 122 of FIG. 6 to determme 
whether a new command block may be loaded mto a command queue implies that if a 
command queue may hold up to m command blocks, then only m -1 command blocks 
may be stored in the command queue at the same time. 
30 Referrmg now to FIG. 9, operations for executmg a command block that has 

been loaded mto the E/A command queue 44 and/or the PK command queue 46, m 
accordance with embodiments of the present invention, wiU be described m more 
detail hereafter. At block 132, the cryptographic accelerator processor 14 determines 
whether the write address is equal to the read address. Specifically, the E/A execution 
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unit 32 determines whether the write address is equal to the read address for the E/A 
command queue 44 and the PK engine execution unit 34 determines whether the write 
address is equal to the read address for the PK command queue 46. If the result 
determined at block 132 is "Yes," then the cryptographic processor 14 waits until the 
5 host processor 16 loads a new command block into the command queue. If, however, 
the result determined at block 132 is "No." then the cryptographic accelerator 
processor 14 downloads the command block at the read address associated with the 
command queue and executes the command block at block 134. In particular 
embodiments of the present invention, multiple command blocks may be downloaded 
10 for execution on the cryptographic accelerator processor 14 at the same time, which 
may further improve performance. ITie cryptographic accelerator processor 14 then 
increments the read address at block 136 by an amount conresponding to the size of 
the executed command block. 

Returning to FIGS. 7 and 8. the read addresses are set to point to the first 
15 command block slot, which has been loaded with a command block by the host 
processor 16. The E/A execution unit 32 and the PK engine execution unit 34 may 
read the command blocks loaded in the E/A command queue 44 and the PK command 
queue 46, respectively, with only mimmal interaction with the host processor 16, e.g., 
maintenance of the read pointers 92 and 112, and the write pointers 94, and 114. In 
20 general, the cryptographic accelerator processor 14 may continue to execute 

commands located in a circular command queue m system memory untU the read 
address equals the write address for that command queue. 

In accordance with further embodiments of the present invention, interaction 
between the host processor 16 and the cryptographic accelerator processor 14 may be 
25 fiirther reduced and overall system performance improved by including load and store 
commands in the cryptographic accelerator processor's command set Referring now 
to FIG. 10, a load command loads one or more operands from the system memory 22 
(e.g., the data buffer(s) 47) to the local memory 36 at block 142. The cryptographic 
accelerator processor 14 then performs one or more opemtions on the operand(s) at 
30 block 144 to generate a result that is stored in the local memory 36. A store command 
then stores the result in the system memory 22 at block 146. Advantageously, the 
host processor 16 need not consume processing time downloading operands to the 
cryptographic accelerator processor 14 and/or uploading results from the 

cryptographic accelerator processor 14 into the system memory 22. 
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To improve utilization of the chip area used to implement the cryptographic 
accelerator processor 14, at least a portion of the operands downloaded from the 
system memory 22 may be stored in the local memory 36. Instead of using a register 
number to identify the location of operands and results, an offset is used that identifies 
5 therelativepositionoftheoperandsandresultsinthelocalmemory36. Forexample. 
to perform the operation "a + b = c," a cryptographic accelerator processor 14 
instruction may indicate that "a" is at of&et 0 relative to a base address of the local 
memory 36, "b" is at offset 8 relative to the base address of the local memory 36, and 
the result "c" should be placed at offset 122 relative to the base address of the local 
10 memory 36. In accordance with further embodiments of the present invention, the 
result generated in the local memory 36 may also be stored in aresult field of a 
command block, which is located in one of the command queues 44 and 46 in the 
system memory 22. Advantageously, operands and results may be packed together 
into the local memory 36, which may conserve storage space. Because there is no 
15 wasted space in storing the operands and results in the local memory 36, memory 
utilization may be improved. If the cryptographic accelerator processor 14 needs to 
be redesigned to handle larger operands, then the local memory 36 may be easier to 
resize than resizing several registers. 

In accordance with further embodiments of the present invention, interaction • 
20 between &e host processor 16 and the cryptographic accelerator processor 14 may be 
further reduced and overall system performance improved by aUowmg the 
cryptographic accelerator processor 14 to inform the host processor 16 when 
command blocks have been executed. Referring now to FIG. 11, the host processor 
16 loads a command block mto one of the command queues 44 and 46 at block 152. 
25 As shown in HG. 12A, the command block may include an intermpt field, which 
may be set by the host processor 16 to turn an interrupt request on or oft The 
cryptographic accelerator processor 14 downloads the command block from one of 
the command queues 44 and 46 and executes the command hlock at block 154. The 
cryptographic accelerator processor 14 may optionally store error mformation in the 
30 command block as shown in FIG. 12B at block 156. The error information may 
comprise information that is associated with downloading the command block to the 
cryptographic accelerator processor 14 and/or executing the command block on the 
cryptographic accelerator processor 14. At block 158, if an mterrupt has been 
requested in the interrupt field of the command block, then the cryptographic 
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accelerator processor 14 invokes an interrupt to notify the host processor 16 that the 
command block has completed. 

In other embodiments of the present invention, instead of invoidng an interrupt 
to notify the host processor 16 that a command block has been executed, the 

5 cryptographic accelerator processor 14 may update a completion field in the 
command block as shown in HG. 12C. In addition, a periodic interrupt may be 
defined that upon each occurrence triggers the host processor 16 to check one or more 
of the command queues 44 and 46 to determine whether any of the command blocks 
stored therein have been executed by examining their completion fields. In still other 

10 embodiments of the present invention, the cryptographic accelerator processor 14 may 
store the results from executing a command block in the command block as shown in 
FIG. 12D. 

In still other embodiments of the present invention, the host processor 16 may 
set a timer when storing a command block into a conamand queue 42, 44. Upon 

15 expiration of the timer, the host processor 16 may check to determine whether the 
command block has been executed. Advantageously, the status of a command block 
may be determined by the host processor 16 without the need to process an interrupt • 
fiom the cryptographic accelerator processor 14. 

In accordance with fiirther embodiments of the present invention, improved 

20 utilization of the system memory 22 may be attained by re-using at least a portion of a 
command block that contams input data to store a result or output that is generated by 
an adjunct processor, such as the cryptographic accelerator processor 14, upon 
executing the command block. It is assumed that the size of the result or output is 
small enough to fit into the portion of the command block containing the input data 

25 that is to be overwritten, hi addition, the region of the command block in which the 
result or output is stored should be selected carefiilly to ensure that the input data that 
is overwritten is no longer needed by the host processor 16 after the command block 
has been executed by the adjunct processor. 

Referring now to FIG. 13, exemplary operations begin at block 162 where the 

30 host processor loads a conunand block that includes input data into one of the 

command queues 44 or 46 in the system memory 22. Note that that instead of or in 
addition to including input data mto the command block, the command block may 
mclude pointers to input data that reside, for example, m the data buffer(s) 47 in the 
system memory 22. An adjunct processor, such as the cryptographic accelerator 
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processor 14, may download the command block and perform one or more operations 
on the input data to generate a result at block 164. If the command block includes 
pointers to input data, then the data are separately downloaded to the cryptographic 
accelerator processor 14 using the input data pointers. The result is then stored in the 

5 command block in the system memory 22 at block 166 such that at least a portion of 
the input data is overwritten. Advantageously, the memory reserved for ttie command 
block in the system memory 22 may be reduced because additional storage space need 
not be reserved to store the result of exiscuting the command block either in the 
command block or elsewhere in the system memory 22. 

1 0 The foregoing operations are illustrated by way of example in JIGS. 14A and 

14B, which show an exemplary command block for decrypting an encrypted packet. 
Specifically, in FIG. 14A, a conunand block is shown that comprises a field that 
contains a hash key for the encrypted packet and another field that contains input 
information. The cryptographic accelerator processor 14 downloads the command 

1 5 block of FIG. 14 A and performs hash operations using the hash key and input 

information to generate a hash value. As shown hi BIG. 14B, this hash value is then 
stored in the command block in the system memory 22 by overwriting the input 
information, which is no longer needed once the hash value has been computed. Note 
that the input information may be one or more pointers to input data stored, for 

20 example, in the data buffer(s) 47 m the system memory 22 . 

In accordance with further embodhnents of the present mvention illustrated m 
FIG. 15, the command block may include an input pointer field and/or an output 
pomter field, which are used to identify the location of the encrypted packet in the 
system memory 22 and the location where the decrypted packet is to be stored in the 

25 system memory 22. For example, the cryptographic accelerator processor 14 may use 
the input pointer to download the encrypted packet fi-om the system memory 22 and 
may then decrypt the encrypted packet using the hash key and mput information to 
generate a hash value as discussed hereinabove. Note that the input information may 
be one or more pointers to input data stored, for example, in the data buflfer(s) 47 in 

30 the system memory 22. The hash value may be attached to the decrypted packet and 

the decrypted packet with the attached hash value may be stored in the system 

memory 22 at the address identified by the output pomter field in the command block. 

Cryptographic processors and/or other types of signal processors and 

integrated circuits may use a hardware-based random number generator. The 
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cryptogr^hic accelerator processor 14 may include a RNG execution unit 28 that 
may be used to generate random numbers for use by other execution units of the 
cryptographic accelerator processor 14 and/or the host processor 16. Exemplary 
operations that may be used to reduce mteraction between the host processor 16 and 

5 the cryptographic accelerator processor 14 and to improve overall system 

performance wiU be described hereafter. Referring now to FIG. 16. operations begin 
at block 172 where the cryptographic accelerator processor 14 loads a random number 
sample into the RN data queue 42 beginnmg at the write address stored in the write 
pointer field 74 of the RN data queue pointer register 68 (see FIG. 3). At block 174 

10 the host processor 16 reads the random number sample in the RN data queue 42 

beginning at the read address stored ui the read pointer field 72 of the RN data queue 
pointer register 68 (see FIG. 3). Thus, according to embodiments of the present 
invention, the host processor 16 need not spend time interacting directiy with the 
cryptographic accelerator processor 14 to request blocks of random data and/or 

15 reading random data from, for example, one or more registers on the cryptographic • 
accelerator processor 14 chip. 

Referring now to FIG. 17, operations for loading a random number sample 
into the RN data queue 42, in accordance with embodiments of the present invention, 
will be described in more detail hereafter. Before writing a random number sample 

20 into the RN data queue 42, the cryptographic accelerator processor 14 determines at 
block 182 whether the write address plus the random number sample size equals the 
lead address stored in the read pointer field 72. If the result determined at block 182 
is "Yes," then the cryptographic processor 14 postpones loading a new random 
number sample into the RN data queue 42 until the host processor 16 has incremented 

25 the read address. If, however, the result determined at block 182 is "No," then the 
cryptographic processor 14 loads a random number sample into the RN data queue 42 
at block 184 at the write address stored in the write pointer field 74 and then 
increments the write address at block 186 by an amount corresponding to the size of 
&e loaded random number sample. 

30 Note that in accordance with embodiments of the present invention, the 

cryptographic processor 14 may include a register and/or may recognize a command 
block that may be written to the cryptographic processor 14 that aUows the host 
processor 16 to, for example, provide the cryptographic processor 14 with a random 
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number seed and/or instruct the cryptographic processor 14 to begin generating 
random numbers. 

The foregoing operations are iUustrated, for exampie, in FIG. 18, which 
shows an exemplary embodiment of the RN data queue 42. As shown in FIG. 18. the 

5 RN data queue 42 is configured to hold 5 12 random number samples, which each 
comprise 64 bits. The cryptographic processor 14 has written four random number 
samples into addresses 1 through 4 and the write address has been incremented to 
point to Hie next available address, which is empty or contains data that have akeady 
been read by the host processor 16. The addresses shown in FIG. 18 are based on 

10 random number sample units for purposes of illustration. These addresses may be 
converted into absolute addresses by multiplying the random number sample number 
by 64 and adding the resulting product to the respective base address for the RN data 
queue 42, which is stored in the RN data queue base address register 64. Note that the 
test used at block 182 of FIG. 17 to determine whether a new random number sample 

1 5 may be loaded into the RN data queue 42 impUes that if the RN data queue 42 may 
hold up to m random number samples, then only m - 1 random number samples may 
be stored in the RN data queue 42 at the same time. Thus, if the RN data queue 42 is 
filled to its capacity, then it may hold 32,704 bits (511, 64-bit random number • 
samples), which exceeds the 20,000 bits required by the Federal Information 

20 Processing Standard (FIPS) 140-1, Security Requirements for Cryptographic Modules 

issued January 1 1 . 1994. 

Referring now to FIG. 19, operations for reading a command block that has 
been loaded into the RN data queue 42, in accordance with embodiments of the 
present invention, will be described in more detail hereafter. At block 192, the host 

25 processor 16 determines whether the write address is equal to the read address. If the 
result determined at block 192 is "Yes," then the host processor 16 waits until the 
cryptographic accelerator processor 14 loads a new random number sample into the 
RN data queue 42. If, however, the result determined at block 192 is "No." then the 
host processor 16 reads the random number sample at the read address stored m the 

30 read pointer field 72 at block 194. The host processor 16 then increments the read 
address at block 196 by an amount corresponding to the size of the random number 
sample. The host processor 16 need not check the current write address every tune a 
new random number sample is read. Instead, the host processor 16 may check the 



18 



wo 01/86430 PCT/USOl/15180 

write address when the read address is getting close to the last value the host 
processor 16 has for the write address. 

Thus, according to embodhnents of the present invention, a cryptographic 
accelerator processor 14 may provide random number samples for use by a host 

5 processor 16 with reduced interaction between the host processor 16 and the 

cryptographic accelerator processor 14. In general, the host processor 14 need only 
interact with the cryptographic accelerator processor 14 to update the read address and 
to check the value of the write address when the read address approaches the last 
value the host processor 14 has for the write address. Li addition, the cryptographic 

1 0 accelerator processor 14 may manage the buffering of the random number samples, 
which may conserve processor cycles of the host processor 16 and may reduce 
transactions on the system bus 26, which may improve overall system performance. 

The performance of cryptographic data processing systems may be affected by 
the system architecture and the methodology used to perfomi operations. For 

1 5 example, conventional cryptographic data processing systems may comprise one or 
more ASICs, such as the ASIC 202 shown in FIG. 20. The ASIC 202 comprises a 
plurality of functional units 204, 206, and 208, which are configured to perform 
specific operations. As shown in FIG. 20, however, input commands are provided to 
the ASIC 202 serially and then routed to the appropriate functional unit 204, 206, 

20 and/or 208. The outputs and/or results of executing the input commands are provided 
serially as command outputs firom the ASIC 202. Tlius, the ASIC 202 typically 
processes commands sequentially such that a first command must finish before a 
subsequent command may be processed even if the commands are executed by 
different fianctional units. 

25 Referring now to FIG. 21, the performance of cryptographic data processing 

systems may be improved, in accordance with embodiments of the present invention, 
by providing separate command interfaces that are respectively associated with the 
functional units such that each functional unit may receive command inputs and may 
generate command outputs and/or results independentiy of other fimctional units. As 

30 shown in FIG. 21, an ASIC 212 includes a plurality of functional units 214, 216, and 
218, which each receive command inputs through its own command interface and 
' generate outputs and/or results -that may-be communicated to- another processor 
through the command interface. By associating a separate command interface with 
each functional unit 214, 216, and 218, the functional units 214, 216, and 218 may 



wo 01/86430 PCT/USOl/15180 

operate independently and in parallel, thereby improving the performance of a 
cryptographic data processing system. 

Referring now to BIG. 22, the functional units 214, 216, and 218 may 
comprise the E/A execution unit 32, the RNG execution unit 28, and the PK engine 

5 execution unit 34. The E/A execution unit 32 comprises a command interface 
manager 222, the RNG execution unit 28 comprises a command interface manager 
224, and the PK engine execution unit 34 comprises a command interface manager 
226. These respective command interfece managers 222, 224, and 226 may be used 
to receive input command blocks from the E/A command queue 44, to transmit 

1 0 random number samples to tlie RN data queue 42, and to receive input command 
blocks from the PK command queue 46, respectively, and to allow the respective 
execution units 28, 32, and 34 to perform operations in parallel. 

Referring now to FIG. 23, operations of cryptographic data processing 
systems in which command interface managers are respectively associated with a 

1 5 plurality of functional units, in accordance with embodiments of the present 

invention, will be described hereafter. Operations begin at block 232 where one or 
more command blocks are provided to each of the ftmctional units, such as, for 
example, by providing command blocks in the E/A command queue 44 and the PK 
command queue 46 for the E/A execution unit 32 and the PK engine execution unit 

20 34, respectively. At block 234, the command blocks are sunultaneously executed by 
the functional units by accessmg the command blocks in parallel through, for 
example, the command interface manager 222 and the command interface manager 
226, which are associated witii tihie E/A execution unit 32 and the PK engine execution 
unit 34, respectively. 

25 Note that command blocks may be provided to the cryptographic processor 14 

in serial feshion over the system bus 24. Nevertheless, the cryptographic processor 14 
may distribute command blocks to the command interfece managers 222, 224, and 
226 associated with the execution units 32, 28, and 34, which may then process tiie 
' command blocks in parallel. 

30 For purposes of illustration, exemplary embodiments of tiie present invention 

have been discussed hereinabove in which operations related to random number 
generationi encryption/authentication, and public key generation are performed in 
parallel based on functional units defined therefor. It will be understood that the 
operations that may be performed in parallel may be adjusted based on requirements 

20 
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and/or needs. Moreover, commands may be provided to the command interface 
managers in a variety of ways. A processor may write conamands directly to the 
command interface managers or, alternatively, commands may be stored in a memory 
and the command interface managers may be provided with the addresses where they 

5 may retrieve the stored commands for execution. 

In summary, by performing operations in parallel using a plurality of 
functional units, the total number of operations that may be performed may be 
increased and the average latency for completing operations may be reduced. 

The flowcharts of FIGS. 2, 6, 9 - 11, 13, 16, 17, 19, and 23 illustrate the 

1 0 arcliitecture, functionality, and operations of possible embodiments of the 

cryptographic data processing system 12 of FIG. 1. In this regard, each block may 
represent a module, segment, or portion of code, which comprises one or more 
executable instructions for unplementing the specified logical function(s). It should 
also be noted that in some alternative embodiments, the functions noted in the blocks 

15 may occur out of the order noted in FIGS. 2, 6, 9 - 11, 13, 16, 17, 19, and 23. For 
example, two blocks shown in succession may in fact be executed substantially 
concurrently or the blocks may sometimes be executed in the reverse order, 
depending on the functionality involved. 

In concluding the detailed description, it should be noted that many variations 

20 and modijScations can be made to the preferred embodiments without substantially 
departing fi'om the principles of the present invention. All such variations and 
modifications are intended to be included herein within the scope of the present 
invention, as set forth in the following claims. 



21 



wo 01/86430 PCT/USOl/15180 

CLAIMS 

We claim: 

1 . A mefliod of operating a cryptographic data processing system that 
comprises a host processor, a system memory coupled to the host processor, and a 
cryptographic processor mtegrated circuit that comprises a local memory and is 
coupled to the host processor and the system memory, the method comprising: 

5 loading at least one operand from the system memory to the local memory; 

performing at least one operation on the at least one operand to generate a 
result in the local memory; and 

stormg the result generated in the local memory in the system memory. 

2. A method as recited in Claim 1 , wherein performing the at least one 
operation, and stormg the result are perforaied by the cryptographic processor without 
interaction with the host processor. 

3 . A cryptographic data processing system, comprising: 
a host processor; 

a system memory coupled to the host processor; and 

a cryptographic processor that comprises a local memory and is coupled to the 
5 host processor and the system memory, the cryptographic processor being 
programmed to load at least one operand jfrom the system memory to the local 
memory, perform at least one operation on the at least one operand to generate a result 
in the local memory, and store the result generated in the local memory in the system 
memory. 

4. A method of operating a cryptographic data processing system that 
comprises a host processor, a system memory coupled to the host processor, and a 
cryptographic processor integrated circuit that comprises a local memory and is 
coupled to the host processor and the system memory, the method comprising: 

5 providing a command queue in the system memory; 

loading a command block into the command queue using the host processor; 
executing the command block using the cryptographic processor; and 
notifying the host processor that the command block has been executed. 

22 
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5. A method as recited in Claim 4, further comprising: 

providing a read address for the command queue and a write address for the 
command queue; 

wherem loading the command block mto the command queue usmg the host 
5 processor comprises loading the command block into the command queue using the 
host processor beginning at the write address, and wherem executing the command 
block using the cryptographic processor comprises executing the command block 
using the cryptographic processor beginnmg at the read address. 

6. A method as recited in Claim 5, wherein loading the command block 
into the command queue using the host processor beginning at the write address 
comprises: 

determining if the write address plus an amount corresponding to a size of a 
5 single command block equals the read address; and 

loading the command block into the command queue using the host processor 
beginning at the write address if the write address plus the amount corresponding to 
the size of the single command block does not equal the read address. 

7. A method as recited in Claim 6, further comprising: 
incrementing the write address by the amount corresponding to the size of a 

single command block using the host processor after loading the command block into 
the command queue using the host processor begmning at the write address if the 
5 write address plus the amount corresponding to the size of the single command block 
does not equal the read address. 

8. A method as recited in Claim 5, wherein executing the command block 
using the cryptographic processor beginning at the read address comprises: 

determining whether the read address is equal to the write address; and 
executing tiie command block using the cryptographic processor beginning at 
5 the read address if the read address is not equal to the write address. 

9. A method as recited m Claim 8, further comprising: 
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incrementing the read address by an amount corresponding to a size of a single 
command block using the cryptographic processor after executing the command block 
using the cryptographic processor begmning at the read address. 

10. A method as recited in Claun 4, wherein notifying the host processor 
that the command block has been executed comprises invoking an interrupt using the 
cryptographic processor after executing the command block. 

11. A method as recited in Claim 4, wherein notifying the host processor 
that the command block has been executed comprises updating a completion field in 
the command block using the cryptographic processor. 

12. A method as recited in Claim 1 1 , fiirtlier comprising: 
providing a periodic interrupt; and 

reading the completion field using the host processor upon invocation of the 
periodic interrupt. 

13. A method as recited in Claim 4, wherein notifying the host processor 
that the command block has been executed comprises: 

setting a timer after loading the command block into the command queue 
using the host processor; and 
5 checking whether the command block has been executed after expbration of 

the timer. 

14. A method as recited in Claim 4, fiirther comprising: 

loadmg at least one operand from the command queue to the local memory; 
performing at least one operation on the at least one operand to generate a 
result in the local memory; and 
5 storing the result generated in the local memory in the command queue. 

15. A method of operating a cryptographic data processing system that 
comprises a host processor, a system memory coupled to the host processor, and a 
cryptographic processor integrated circuit that is coupled to the host processor and the 
system memory, the method comprising: 

24 



wo 01/86430 



PCT/USOl/15180 



5 providing a command queuie in the system memory; 

loading a command block into the command queue using the host processor; 
setting a value of an interrupt field in the command block to request an 
mtemipt when the command block has been executed; 

executing the conmiand block using the cryptographic processor, and 
1 0 invokmg an interrupt usmg the cryptographic processor after executing the 

command block if the mterrupt field in the command block is set to the value to 
request the interrupt. 

1 6. A method as recited in Claun 1 5, further comprising: 
storing error information in the command block tiiat is associated with 

executing the command block using the cryptographic processor. 

1 7. A method of operating a cryptographic data processing system that 
comprises a host processor, a system memory coupled to the host processor, and a 
cryptographic processor integrated circuit that is coupled to the host processor and the 
system memory, the method comprising: 

5 providing a command queue in the system memory; 

loading a command block into the command queue using the host processor; 
executmg the command block using the cryptographic processor; and 
updating a completion field in the command block using the cryptographic 
processor. 

18. A method as recited m Claim 1 7, further comprising: 
providing a periodic interrupt; and 

reading the completion field using the host processor upon invocation of the 
periodic interrupt 

1 9. A method as recited m Claun 17, further comprismg: 
storing error information in the conmiand block that is associated with 

executing the command block using the cryptographic processor. 

20. A method of operating a data processing system that comprises a host 
processor, a system memory coupled to the host processor, and an adjunct processor 

25 



wo 01/86430 PCT/USOl/15180 

integrated circuit that is coupled to the host processor and the system memory, the 
method comprising: 
5 providing a conmiand queue in the system memory; 

loading a command block into the command queue using the host processor, 
the command block comprising an input data field that contains input data; 

performing an operation based on the mput data usmg the adjunct processor to 
generate a result; and 

1 0 storing the result in the input data field such that at least a portion of the input 

data is overwritten. 

21. A metlaod as recited in Claim 20, wherein the data processing system 
comprises a cryptographic data processing system, the adjunct processor integrated 
circuit comprises a cryptographic processor integrated circuit, and perforaung the 
operation based on the input data comprises: 

5 performing a hash operation based on the input data using the cryptographic 

processor to generate a hash value, 

22. A method as recited in Claim 21 , wherein stormg the result in the input 
data field comprises : 

storing the hash value in the input data field such that the at least a portion of 
the input data is overwritten. 

23. A method as recited in Claim 21, wherein the conmiand block fiirther 
comprises an input pointer field that contains an address in the system memory of an 
incoming packet and wherein performing the hash operation comprises: 

performing the hash operation based on the input data and the incoming packet 
5 using the cryptographic processor to generate the hash value. 

24. A method as recited in Claim 23, wherein the command block fiirther 
comprises an output pointer field that contains an address in the system memory for 
storing a decrypted packet, the method further comprising: 

decrypting the-mcoming packet using-the cryptographic processor to. generate 
5 the decrypted packet; 

attaching the hash value to the decrypted packet; and 
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means for providing a periodic inteirupt; and 
means for reading the completion field using the host processor upon 
5 invocation of the periodic interrupt. 

42. A cryptographic data processing system as recited in Claim 33, 
wherein the means for notifying the host processor that the command block has been 
executed comprises: 

means for setting a timer after loading the command block into the command 
5 queue using the host processor; and 

means for checking whether the command block has been executed after 

expiration of the timer. 

43 . A cryptographic data processing system as recited in Claim 33, further 
comprising: 

means for loading at least one operand from the coimnand queue to fte local 
memory; 

5 means for performing at least one opemtion on the at least one operand to 

generate a result in the local memory; and 

means for storing the result generated in the local memory in the command 

queue. 

44. A cryptographic data processing system that comprises a host 
processor, a system memory coupled to the host processor, and a cryptographic 
processor integrated circuit that is coupled to the host processor and the system 
memory, the system further comprising: 

5 means for providmg a command queue in the system memory; 

means for loading a command block into the command queue using the host 
processor; 

means for setting a value of an interrupt field m the command block to request 
an interrupt when the command block has been executed; 
10 means for executing the command block using the cryptographic processor; 

and 
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Storing the decrypted packet with the attached hash value at the address in the 
system memory contained in the output pointer field. 

25. 'A method of operating a cryptographic data processing system that 
comprises a host processor, a system memory coupled to tiie host processor, and a 
cryptographic processor integrated circuit that comprises a local memory and is 
coupled to the host processor and the system memory, the method comprising: 
5 providing a command queue in the system memory; 

providing a read address for the command queue and a write address for the 
command queue; 

loadmg a random number sample into the command queue using the 
cryptographic processor beginning at the write address; and 
1 0 reading the random number sample using the host processor beginning at the 

read address. 



26. A method as recited m Claim 25, wherein loading the random number 
sample into the command queue using the cryptographic processor beginning at the 
write address comprises: 

detemining if the write address plus an amount corresponding to a size of a 
5 single random number sample equals the read address; and 

loading the random number sample into the conunand queue using the 
cryptographic processor beginning at the write address if the write address plus the 
amount corresponding to the size of the single random number sample does not equal 
the read address. 

27. A method as recited in Claim 26, further comprising: 
incrementing the write address by the amount corresponding to the size of a 

single random number sample using the cryptographic processor after loading the 
random number sample into the command queue using the cryptographic processor 
5 beginning at the write address if the write address plus the amount conesponding to 
the size of the smgle random number sample does not equal the read address- 

28. A method as recited in Claim 25, wherein reading the random number 

sample using the host processor begiiming at the read address comprises: 
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determining whether the read address is equal to the write address; and 
reading the random nimiber sample using the host processor beginning at the 
5 read address if the read address is not equal to the write address. 

29. A method as recited in Claim 28, further comprising: 
incrementing the read address by an amount corresponding to a size of a single 

random number sample using the host processor after reading the random number 
sample using the host processor begiiming at the read address. 

30. A method of operating a data processing system that comprises a host 
processor, a system memory coupled to the host processor, and an adjunct processor 
integrated circuit that is coupled to the host processor and the system memory, the 
method comprising: 

5 transferring information between the host processor and the adjunct processor 

using the system memory. 

31. A cryptographic data processing system that comprises a host 
processor, a system memory coupled to the host processor, and a cryptographic 
processor integrated circuit that comprises a local memory and is coupled to the host 
processor and the system memory, the system further comprising: 

5 means for loading at least one operand from the system memory to the local 

memory; 

means for performing at least one operation on the at least one operand to 
generate a result in the local memory; and 

means for storing the result generated in the local memory in the system 
10 memory. 

32. A cryptographic data processing system as recited in Claim 31, 
wherein the means for performing the at least one operation, and the means for storing 
the result execute without interaction with the host processor, 

3 3 . A cryptographic data processing system. that comprises a host 
processor, a system memory coupled to the host processor, and a cryptographic 
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processor integrated circuit that comprises a local memory and is coupled to the host 
processor and the system memory, the system further comprising: 
5 means for providing a command queue in the system memory; 

means for loading a command block into the command queue using the host 
processor; 

means for executing the command block using the cryptographic processor; 

and 

1 0 means for notifying the host processor that the command block has been 

executed. 

34. A cryptographic data processing system as recited in Claim 33, further 
comprising: 

means for providing a read address for the command queue and a write 
address for the command queue; 
5 wherein the means for loading the command block into the command queue 

using the host processor comprises means for loading the command block into the 
command queue using the host processor beginning at the write address, and wherein 
the means for executing the command block using the cryptographic processor 
comprises means for executing the command block using the cryptographic processor 
10 begmning at the read address. 

35. A cryptographic data processing system as recited in Claim 34, 
wherein the means for loading the command block into the conmiand queue using the 
host processor beginning at the write address comprises: 

means for determining if the write address plus an amomt corresponding to a 
5 size of a single command block equals the read address; and 

means for loading the command block mto the command queue using the host 
processor beginning at the write address if the write address plus the amount 
corresponding to the size of the single command block does not equal the read 
address. 

3 6t - A cryptographic data processing system as recited in Claim 3 5, further 
comprising: 
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means for incrementing the write address by the amount corresponding to the 
size of a single command block using the host processor if the write address plus the 
5 amount corresponding to the size of the single command block does not equal the read 
address, the means for incrementing being responsive to the means for loading the 
command block into the command queue using the host processor beginning at the 
write address. 

37. A cryptographic data processing system as recited in Claim 34, 
wherein the means for executing the command block using the cryptographic 
processor beginning at the read address comprises: 

means for determining whether the read address is equal to the write address; 

5 and 

means for executing the command block using the cryptographic processor 
beginning at the read address if the read address is not equal to the write address. 

38. A cryptographic data processing system as recited in Claim 37, further 
comprising: 

means for incrementing the read address by an amount corresponding to a size 
of a single command block using the cryptographic processor, the means for 
5 incrementing being responsive to the means for executing the command block using 
the cryptographic processor beginning at the read address. 

39. A cryptographic data processing system as recited in Claim 33, 
wherein the means for notifying the host processor that the command block has been 
executed comprises means for invoking an interrupt using the cryptographic processor 
after executing the command block. 

40. A cryptographic data processuig system as recited in Claun 33, 
wherein the means for notifying the host processor that the command block has been 
executed comprises means for updating a completion field in the command block 
using the cryptographic processor. 



41. 

comprising: 



A cryptographic data processing system as recited in Claim 40, further 
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is coupled to the host processor and the system memory, the system further 
comprising: 

5 means for providing a command queue in the system memory; 

means for loading a command block into the command queue using the host 
processor, the command block comprising an input data field that contains input data; 

means for perfomiing an operation based on the input data using the adjunct 
processor to generate a result; and 
10 means for storing the result in the input data field such that at least a portion of 

the input data is overwritten. 

50. A data processmg system as recited in Claim 49, wherein the data 
processing system comprises a cryptographic data processing system, the adjunct 
processor integrated circuit comprises a cryptographic processor integrated circuit, 
and the means for performmg the operation based on Ihe input data comprises: 

5 means for performing a hash operation based on the mput data using the 

cryptographic processor to generate a hash value. 

51. A data processing system as recited in Claim 50, wherein the means 
for storing the result in the input data field comprises: 

means for storing the hash value in the input data field such that the at least a 
portion of the input data is overwritten. 

52. A data processing system as recited in Claim 50, wherein the command 
block fijTther comprises an input pointer field that contains an address in the system 
memory of an incoming packet and wherein the means for performing the hash 
operation comprises: 

5 means for performing the hash operation based on the input data and the 

incoming packet using the cryptographic processor to generate the hash value. 

53 . A data processing system as recited in Claim 52, wherein the command 
block fiirther comprises an output pointer field that contains an address in the system 
memory for storing a decrypted packet, ttie data processing system further 
comprising: 
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5 means for decrypting the incoming packet using the cryptographic processor 

to generate the decrypted packet; 

means for attaching the hash value to the decrypted packet; and 
means for stormg the decrypted packet with the attached hash value at the 
address in the system memory contained in the output pointer field. 



54. A cryptographic data processing system that comprises a host 
processor, a system memory coupled to the host processor, and a cryptographic 
processor integrated circuit that comprises a local memory and is coupled to the host 
processor and the system memory, the system further comprising: 

5 means for providing a command queue in the system memory; 

means for providing a read address for the command queue and a write 
address for the command queue; 

means for loading a random number sample into the command queue using the 
cryptographic processor beginning at the write address; and 
1 0 means for reading the random number sample using the host processor 

beginrdng at the read address. 

55. A cryptographic data processing system as recited in Claim 54, 
wherein the means for loading the random number sample into the command queue 
using the cryptographic processor begiiming at the write address comprises: 

means for determining if the write address plus an amount corresponding to a 
5 size of a single random number sample equals the read address; and 

means for loading the random number sample into the command queue using 
the cryptographic processor beginning at the write address if the write address plus 
the amount corresponding to the size of the single random number sample does not 
equal the read address. 

56. A cryptographic data processing system as recited in Claim 55, further 
comprising: 

means for incrementing the write address by the amount corresponding to the 

size of a single fiSidoin liiimWsample using the 

5 address plus the amount corresponding to the size of the single random number 

sample does not equal the read address, the means for incrementiag bemg responsive 
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to the means for loading the random number sample into the command queue using 
the cryptographic processor beginning at the write address. 



57. A cryptographic data processing system as recited in Claim 54, 
v^erein the means for reading the random number sample using the host processor 
beginning at the read address comprises: 

means for determinmg whether the read address is equal to the write address; 

S and 

means for reading the random number sample usmg the host processor 
beginning at the read address if the read address is not equal to the write address. 

58. A cryptographic data processing system as recited in Claim 57, further 
comprising: 

means for incrementing the read address by an amount corresponding to a size 
of a single random number sample using the host processor, the means for 
5 incrementing being responsive to the means for reading the random number sample 
using the host processor beginning at the read address. 

59. A computer program product for operating a cryptographic data 
processing system that comprises a host processor, a system memory coupled to the 
host processor, and a cryptographic processor integrated circuit that comprises a local 
memory and is coupled to the host processor and the system memory, the computer 

5 program product comprising: 

a computer readable program medium having computer readable program 
code embodied therein, the computer readable program code comprising: 

computer readable program code for loadmg at least one operand from the 
system memory to the local memory; 
1 0 computer readable program code for performing at least one operation on the 

at least one operand to generate a result in the local memory; and 

computer readable program code for storing the result generated in the local 
memory in the system memory. 

60. A computer program product as recited in Claim 59, wherein the 

computer readable program code for performing the at least one operation, and the 
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computer readable program code for storing the result execute without interaction 
with the host processor. 

61 . A computer program product for operating a cryptographic data 
processing system that comprises a host processor, a system memory coupled to the 
host processor, and a cryptographic processor integrated circuit that comprises a local 
memory and is coupled to the host processor and the system memory, the computer 

5 program product comprising: 

a computer readable program medium having computer readable program 
code embodied therein, the computer readable program code comprising: 

computer readable program code for providing a command queue in the 
system memory; 

1 0 computer readable program code for loading a command block into the 

command queue using the host processor; 

computer readable program code for executing the command block using the 
cryptographic processor; and 

computer readable program code for notifying the host processor that the 
1 5 command block has been executed. 

62 . A computer program product as recited in Claim 6 1 , further 
comprising: 

computer readable program code for providing a read address for the 
command queue and a write address for the command queue; 
5 wherein the computer readable program code for loading the command block 

into the command queue using the host processor comprises computer readable 
program code for loading the command block into the command queue using the host 
processor beginning at the write address, and wherein the computer readable program 
code for executing the command block using the cryptographic processor comprises 
1 0 computer readable program code for executing flie command block using the 
cryptographic processor beginning at the read address. 

6S: A computer program product as recited in Claim 62, wherein the 

computer readable program code for loading the command block into the command 

queue using the host processor beginning at the write address comprises: 
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computer readable program code for determining if the write address plus an 
5 amonnt corresponding to a size of a single command block equals the read address; 
and 

computer readable program code for loading the command block into the 
command queue using the host processor beginning at the write address if the write 
address plus the amount correspondmg to the size of the single command block does 
1 0 not equal the read address. 

64. A computer program product as recited in Claim 63, further 
comprising: 

computer readable program code for incrementing the write address by the 
amount corresponding to the size of a single command block using the host processor 
5 if the write address plus the amount corresponding to the size of the single command 
block does not equal the read address, the computer readable program code for 
incrementing being responsive to the computer readable program code for loading the 
command block into the command queue using the host processor beginning at the 
write address. 

65. A computer program product as recited in Claim 62, wherein the 
computer readable program code for executing the command block using the 
cryptographic processor beginning at the read address comprises: 

computer readable program code for determming whether the read address is 
5 equal to the write address; and 

computer readable program code for executing the command block using the 
cryptographic processor beginning at the read address if the read address is not equal 
to the write address. 

66. A computer program product as recited in Claim 65, further 
comprising: 

computer readable program code for incrementing the read address by an 

amount corresponding to a size of a single command block using the cryptographic 

5 processor, the computer readable program-code for incrementing being responsive to 

the computer readable program code for executing the command block usmg the 

cryptographic processor beginning at the read address. 
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67. A computer program product as recited in Claim 61 , wherein the 
computer readable program code for notifying the host processor that the command 
block has been executed comprises computer readable program code for invoking an 
interrupt using the cryptographic processor after executing the command block. 

68. A computer program product as recited in Claim 61 , wherein the 
computer readable program code for notifying the host processor that the command 
block has been executed comprises computer readable program code for updating a 
completion field m the command block using the cryptographic processor. 

69. A computer program product as recited in Claim 68, further 
comprising: 

computer readable program code for providing a periodic interrupt; and 
computer readable program code for reading the completion field using the 
5 host processor upon invocation of the periodic interrupt. 

70. A method as recited in Claim 61, wherein the computer readable 
program code for notifying the host processor that the command block has been 
executed comprises: 

computer readable program code for setting a timer after loading the command 
5 block into the command queue using the host processor; and 

computer readable program code for checking whether the command block 
has been executed after expiration of the timer. 

71. A computer program product as recited in Claim 6 1 , fiirther 
comprising: 

computer readable program code for loading at least one operand from the 
command queue to the local memory; 
5 computer readable program code for performing at least one operation on the 

at least one operand to generate a result in the local memory; and 

computer readable program code for storing the result generated in the local 
memory in the command queue. 
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72. A compute- program product for operating a cryptographic data 
processing system that comprises a host processor, a system memory coupled to the 
host processor, and a cryptographic processor integrated circuit that is coupled to the 
host processor and the system memory, the computer program product comprising: 

5 a computer readable program medium having computer readable program 

code embodied therein, the computer readable program code comprising: 

computer readable program code for providing a command queue in the 
system memory; 

computer readable program code for loading a command block into the 
1 0 command queue using the host processor; 

computer readable program code for setting a value of an interrupt field in the 
command block to request an interrupt when the command block has been executed; 

computer readable program code for executing the command block using the 
cryptographic processor; and 
15 computer readable program code for invoking an interrupt using the 

cryptographic processor after executing the command block if the interrupt field in the 
command block is set to the value to request the interrupt. 

73. A computer program product as recited in Claim 72, further 
comprising: 

computer readable program code for storing error information in the command 
block that is associated with executing the command block using the cryptographic 
5 processor. 

74. A computer program product for operating a cryptographic data 
processing system that comprises a host processor, a system memory coupled to the 
host processor, and a cryptographic processor integrated circuit that is coupled to the 
host processor and the system memory, the computer program product comprising: 

5 a computer readable program medium havrug computer readable program 

code embodied therein, the computer readable program code comprismg: 

computer readable program code for providing a conmiand queue in the 
system memory;- 

computer readable program code for loading a command block into the 
1 0 command queue using the host processor; 
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computer readable program code for executing the command block using the 
cryptographic processor; and 

computer readable program code for updating a completion field in the 
command block using the cryptographic processor. 

75. A computer program product as recited in Claim 74, fiirther 
comprising: 

computer readable program code for providing a periodic interrupt; and 
computer readable program code for reading the completion field using the 
5 host processor upon invocation of the periodic interrupt 

76. A computer program product as recited in Claim 74, fiirfher 
comprising: 

computer readable program code for storing error information in the command 
block that is associated with executing the command block using the cryptographic 
5 processor. 

77. A computer program product for operating a data processing system 
that comprises a host processor, a system memory coupled to the host processor, and 
an adjunct processor integrated circuit that is coupled to the host processor and the 
system memory, the computer program product comprising: 

5 a computer readable program medium having computer readable program 

code embodied therein, the computer readable program code comprising: 

computer readable program code for providing a conunand queue in the 
system memory; 

computer readable program code for loading a command block into the 
1 0 command queue using the host processor, the command block comprising an input 
data field that contains input data; 

computer readable program code for performing an opoation based on the 
input data using the adjunct processor to generate a result; and 

computer readable program code for storing the result in the input data field 
15 such that at least apportion of the iiiput data is overwritten. 
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78. A computer program product as recited in Claim 77, wherein the data 
processing system comprises a cryptographic data processing system, the adjunct 
processor integrated circuit comprises a cryptographic processor integrated circuit, 
and the computer readable program code for performing the operation based on the 

5 input data comprises: 

computer readable program code for performing a hash operation based on the 
input data usiiig the cryptographic processor to generate a hash value. 

79. A computer program product as recited in Claim 78, wherein the 
computer readable program code for storing the result in the input data field 
comprises: 

computer readable program code for storing the hash value in the input data 
5 field such that the at least a portion of the input data is overwritten. 

80. A computer program product as recited in Claim 78, wherein the 
command block fiirther comprises an input pointer field that contains an address in 
the system memory of an incoming packet and wherein the computer readable 
program code for performing the hash opemtion comprises: 

5 computer readable program code for performing the hash operation based on 

the hxpvit data and the incoming packet usmg the cryptographic processor to generate 
the hash value. 

81 . A computer program product as recited in Claim 80, wherein the 
command block further comprises an output pointer field tiiat contains an address in 
the system memory for storing a decrypted packet, the computer program jjroduct 
fiirther comprising: 

5 computer readable program code for decrypting the incoming packet usmg the 

cryptogmphic processor to generate the decrypted packed 

computer readable program code for attaching the hash value to the decrypted 
packet; and 

computer readable program code for storing the decrypted packet with the 
1 0 - attached hash value at the address in the system memory contained m the output 
pointer field. 
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means for invoking an interrupt using the cryptographic processor after 
executing the command block if the interrupt field in the command block is set to the 
value to request the interrupt. 

45. A cryptographic data processmg system as recited in Claim 44, further 
comprising: 

means for storing error information in the command block that is associated 
with executing the command block using the cryptographic processor. 

46. A cryptographic data processing system that comprises a host 
processor, a system memory coupled to the host processor, and a cryptographic 
processor integrated circuit that is coupled to the host processor and the system 
memory, the system further comprising: 

means for providing a command queue in the system memory; 
means for loading a command block into the coinmand queue using the host 
processor; 

means for executing the command block using the cryptographic processor; 

and 

means for updating a completion field in the command block using the 
cryptographic processor. 

47. A cryptographic data processing system as recited in Claim 46, further 
comprising: 

means for providing a periodic interrupt; and 
means for reading the completion field using the host processor upon 
5 invocation of the periodic interrupt 

48. A cryptographic data processing system as recited in Claim 46, further 
comprising: 

means for storing error information in the command block that is associated 
with executing the command block using the cryptographic processor, 

49. A data processing system that comprises a host processor, a system 
memory coupled to the host processor, and an adjimct processor integrated circuit that 
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82. A computer program product for operating cryptographic data 
processing system that comprises a host processor, a system memory coupled to the 
host processor, and a oyptographic processor integrated circuit that comprises a local 
memory and is coupled to the host processor and the system memory, the computer 

5 program product comprising: 

a computer readable program medium having computer readable program 
code embodied therem, the computer readable program code comprising: 

computer readable program code for providing a command queue in the 
system memory; 

1 0 computer readable program code for providmg a read address for the 

command queue and a write address for the command queue; 

computer readable program code for loading a random number sample into Ihe 
command queue using the cryptographic processor begiiming at the write address; and 
computer readable program code for reading Ihe random number sample using 
15 the host processor beginning at the read address. 

83. A computer program product as recited in Claim 82, wherein the 
computer readable program code for loading the random number sample into the 
command queue using the cryptographic processor beginning at the write address 
comprises: 

5 computer readable program code for determining if the write address plus an 

amount corresponding to a size of a single random number sample equals the read 
address; and 

computer readable program code for loading the random number sample into 
the command queue using the cryptographic processor beginning at the write address 
10 if the write address plus the amount corresponding to the size of the single random 
number sample does not equal the read address. 

84. A computer program product as recited in Claim 83 , further 
comprising: 

computer readable program code for incrementing the write address by the 

amount corresponding to the size of a single random number sample using the 

5 cryptographic processor if the write address plus the amount corresponding to the size 

of the single random number sample does not equal the read address, the computer 
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readable program code for incrementing bdng responsive to tfie computer readable 
program code for loading the random number sample into the command queue using 
the cryptographic processor beginning at flie write address. 

85 . A computer program product as recited in Claim 82, wherein the 
computer readable program code for reading the random nimiber sample iising the 
host processor beginning at the read address comprises: 

computer readable program code for determining whether the read address is 
5 equal to the write address; and 

computer readable program code for reading the random number sample using 
the host processor beginning at the read address if the read address is not equal to the 
write address. 

86. A computer program product as recited in Claim 85, further 
comprising: 

computer readable program code for incrementing the read address by an 
amount corresponding to a size of a single random number sample using the host 
5 processor, the computer readable program code for incrementing being responsive to 
the computer readable program code for reading the random number sample using the 
host processor begiiming at the read address. 
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